Physics Simulation


DeepSimHO: Stable Pose Estimation for Hand-Object Interaction via Physics Simulation

Neural Information Processing Systems

This paper addresses 3D pose estimation of a hand interacting with an object from a single image observation. When modeling hand-object interaction, previous works mainly exploit proximity cues while overlooking the dynamical nature of the problem: the hand must stably grasp the object to counteract gravity and thus prevent it from slipping or falling. Because these works fail to leverage dynamical constraints in the estimation, they often produce unstable results. Meanwhile, refining unstable configurations with physics-based reasoning remains challenging, both due to the complexity of contact dynamics and due to the lack of effective and efficient physics inference in data-driven learning frameworks. To address both issues, we present DeepSimHO: a novel deep-learning pipeline that combines forward physics simulation and backward gradient approximation with a neural network. Specifically, for an initial hand-object pose estimated by a base network, we forward it to a physics simulator to evaluate its stability. However, due to non-smooth contact geometry and penetration, existing differentiable simulators cannot provide reliable state gradients. To remedy this, we further introduce a deep network that learns the stability evaluation process from the simulator while smoothly approximating its gradient, thus enabling effective back-propagation. Extensive experiments show that our method noticeably improves the stability of the estimation and achieves superior efficiency over test-time optimization.
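The surrogate-gradient idea at the heart of this abstract can be sketched in miniature: a non-smooth simulator check yields no usable gradient, so a smooth learned stand-in supplies one, and the pose is refined through the stand-in until the simulator's own check passes. The toy below is not the paper's implementation; the 1-D "pose", the Gaussian surrogate, and all constants are illustrative.

```python
import math

def sim_stability(x):
    """Toy non-smooth 'simulator': the pose x counts as stable only inside
    a narrow contact region; its gradient is zero almost everywhere."""
    return 1.0 if abs(x - 2.0) < 0.1 else 0.0

def surrogate(x, w=10.0, c=2.0):
    """Smooth learned stand-in for the simulator's stability score:
    a bump centred on the stable region, with useful gradients."""
    return math.exp(-w * (x - c) ** 2)

def surrogate_grad(x, w=10.0, c=2.0):
    return -2.0 * w * (x - c) * surrogate(x, w, c)

def refine(x0, lr=0.05, steps=200):
    """Gradient ascent on the surrogate stands in for back-propagating
    a stability loss into the base pose estimator."""
    x = x0
    for _ in range(steps):
        x += lr * surrogate_grad(x)
    return x

x0 = 1.5                              # initial pose estimate: unstable
assert sim_stability(x0) == 0.0
x_ref = refine(x0)
assert sim_stability(x_ref) == 1.0    # refined pose passes the simulator check
```

The point of the indirection is that the refinement loop never differentiates the simulator itself, only its smooth approximation.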


MIMIC-MJX: Neuromechanical Emulation of Animal Behavior

Zhang, Charles Y., Yang, Yuanjia, Sirbu, Aidan, Abe, Elliott T. T., Wärnberg, Emil, Leonardis, Eric J., Aldarondo, Diego E., Lee, Adam, Prasad, Aaditya, Foat, Jason, Bian, Kaiwen, Park, Joshua, Bhatt, Rusham, Saunders, Hutton, Nagamori, Akira, Thanawalla, Ayesha R., Huang, Kee Wui, Plum, Fabian, Beck, Hendrik K., Flavell, Steven W., Labonte, David, Richards, Blake A., Brunton, Bingni W., Azim, Eiman, Ölveczky, Bence P., Pereira, Talmo D.

arXiv.org Artificial Intelligence

The primary output of the nervous system is movement and behavior. While recent advances have democratized pose tracking during complex behavior, kinematic trajectories alone provide only indirect access to the underlying control processes. Here we present MIMIC-MJX, a framework for learning biologically-plausible neural control policies from kinematics. MIMIC-MJX models the generative process of motor control by training neural controllers that learn to actuate biomechanically-realistic body models in physics simulation to reproduce real kinematic trajectories. We demonstrate that our implementation is accurate, fast, data-efficient, and generalizable to diverse animal body models. Policies trained with MIMIC-MJX can be utilized to both analyze neural control strategies and simulate behavioral experiments, illustrating its potential as an integrative modeling framework for neuroscience.
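The generative loop the abstract describes, where a controller actuates a body model in physics simulation so that the resulting trajectory matches recorded kinematics, can be sketched with the physics reduced to a 1-D point mass and the neural policy replaced by a PD law. All names and gains here are illustrative stand-ins, not the framework's API.

```python
import math

def rollout(kp, kd, ref, dt=0.01, m=1.0):
    """Actuate a unit point mass so its trajectory reproduces reference
    kinematics; the PD law stands in for a learned neural controller."""
    x, v = ref[0], 0.0
    traj = [x]
    for i in range(1, len(ref)):
        vref = (ref[i] - ref[i - 1]) / dt        # reference velocity
        u = kp * (ref[i] - x) + kd * (vref - v)  # 'policy' output = force
        v += (u / m) * dt                        # forward dynamics step
        x += v * dt
        traj.append(x)
    return traj

# Reference kinematics: a smooth reaching movement.
ref = [math.sin(2.0 * t * 0.01) for t in range(300)]
traj = rollout(kp=900.0, kd=60.0, ref=ref)
err = max(abs(a - b) for a, b in zip(traj, ref))
assert err < 0.05   # the controller reproduces the demonstrated kinematics
```

In the real framework the controller is a trained network and the body model is biomechanically realistic, but the objective has the same shape: make simulated kinematics match measured ones.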


Unreal Robotics Lab: A High-Fidelity Robotics Simulator with Advanced Physics and Rendering

Embley-Riches, Jonathan, Liu, Jianwei, Julier, Simon, Kanoulas, Dimitrios

arXiv.org Artificial Intelligence

High-fidelity simulation is essential for robotics research, enabling safe and efficient testing of perception, control, and navigation algorithms. However, achieving both photorealistic rendering and accurate physics modeling remains a challenge. This paper presents a novel simulation framework, the Unreal Robotics Lab (URL), that integrates the advanced rendering capabilities of the Unreal Engine with MuJoCo's high-precision physics simulation. Our approach enables realistic robotic perception while maintaining accurate physical interactions, facilitating benchmarking and dataset generation for vision-based robotics applications. The system supports complex environmental effects, such as smoke, fire, and water dynamics, which are critical to evaluating robotic performance under adverse conditions. We benchmark visual navigation and SLAM methods within our framework, demonstrating its utility for testing real-world robustness in controlled yet diverse scenarios. By bridging the gap between physics accuracy and photorealistic rendering, our framework provides a powerful tool for advancing robotics research and sim-to-real transfer. Our open-source framework is available at https://unrealroboticslab.github.io/.
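The division of labor described here, with MuJoCo owning the dynamics and Unreal only rendering the resulting poses, can be sketched as a lockstep co-simulation loop. The class names and pose message below are hypothetical stand-ins, not the URL API.

```python
class PhysicsEngine:
    """Stand-in for the MuJoCo side: owns the authoritative dynamics state."""
    def __init__(self):
        self.t = 0.0
        self.pos = 0.0
        self.vel = 1.0

    def step(self, dt):
        self.pos += self.vel * dt
        self.t += dt
        return {"body": self.pos, "time": self.t}

class Renderer:
    """Stand-in for the Unreal side: consumes poses, never integrates."""
    def __init__(self):
        self.frames = []

    def submit(self, pose):
        self.frames.append(dict(pose))

physics, renderer = PhysicsEngine(), Renderer()
dt = 1.0 / 60.0
for _ in range(60):
    pose = physics.step(dt)   # physics is the single source of truth
    renderer.submit(pose)     # the renderer only visualizes the result

assert len(renderer.frames) == 60
assert abs(renderer.frames[-1]["body"] - 1.0) < 1e-6
```

Keeping one engine authoritative avoids the drift that appears when a game engine's own physics and an external simulator both integrate the same bodies.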


Efficient Dynamic and Momentum Aperture Optimization for Lattice Design Using Multipoint Bayesian Algorithm Execution

Zhang, Z., Agapov, I., Gasiorowski, S., Hellert, T., Neiswanger, W., Huang, X., Ratner, D.

arXiv.org Artificial Intelligence

University of Southern California, Los Angeles, CA 90089 (Dated: November 25, 2025)

We demonstrate that multipoint Bayesian algorithm execution (multipointBAX) can overcome fundamental computational challenges in storage ring design optimization. Dynamic aperture (DA) and momentum aperture (MA) optimization is a multipoint, multiobjective design task for storage rings, ultimately informing the flux of x-ray sources and the luminosity of colliders. We remove this computational bottleneck using multipointBAX, which selects, simulates, and models each trial configuration at the single-particle level. We demonstrate our approach on a novel design for a fourth-generation light source, with neural-network-powered multipointBAX achieving equivalent Pareto-front results using more than two orders of magnitude fewer tracking computations than genetic algorithms. The significant reduction in cost positions multipointBAX as a promising alternative to black-box optimization, and we anticipate that multipointBAX will be instrumental in the design of future light sources, colliders, and large-scale scientific facilities. Designing modern scientific facilities -- from synchrotron light sources to particle colliders -- requires optimizing hundreds of parameters in complex, nonlinear systems, where a single design evaluation can take hours of computation. In storage rings, this challenge is exemplified by DA and MA optimization, where maximizing the regions of particle stability directly determines injection efficiency, beam lifetime, and ultimately the photon flux or luminosity achievable in next-generation facilities. The computational bottleneck is severe: maximizing DA and MA is a type of multipoint optimization, where evaluating a single lattice design requires tracking tens of thousands of particles for hundreds of thousands of turns, making global optimization prohibitively expensive. Moreover, there is a trade-off between maximizing DA and MA area, so the standard approach is to find a Pareto front, i.e., a set of non-dominated designs that trade off the two objectives.
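For the two-objective DA/MA trade-off, the Pareto front is simply the non-dominated subset of evaluated designs. A minimal sketch, with hypothetical (DA area, MA area) scores standing in for real tracking results:

```python
def pareto_front(points):
    """Return the non-dominated subset when both objectives are maximized,
    e.g. (DA area, MA area) pairs for candidate lattice configurations."""
    front = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    return front

# Hypothetical (DA, MA) scores for four candidate lattices.
cands = [(1.0, 0.2), (0.8, 0.5), (0.5, 0.9), (0.4, 0.4)]
assert pareto_front(cands) == [(1.0, 0.2), (0.8, 0.5), (0.5, 0.9)]
```

The point of multipointBAX is to reach such a front with far fewer expensive tracking evaluations than population-based methods, not to change what the front is.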


DexCanvas: Bridging Human Demonstrations and Robot Learning for Dexterous Manipulation

Xu, Xinyue, Sun, Jieqiang, Jing, Dai, Chen, Siyuan, Ma, Lanjie, Sun, Ke, Zhao, Bin, Yuan, Jianbo, Yi, Sheng, Zhu, Haohua, Lu, Yiwen

arXiv.org Artificial Intelligence

We present DexCanvas, a large-scale hybrid real-synthetic human manipulation dataset containing 7,000 hours of dexterous hand-object interactions seeded from 70 hours of real human demonstrations, organized across 21 fundamental manipulation types based on the Cutkosky taxonomy (Feix et al., 2016). Each entry combines synchronized multi-view RGB-D, high-precision mocap with MANO hand parameters, and per-frame contact points with physically consistent force profiles. Our real-to-sim pipeline uses reinforcement learning to train policies that control an actuated MANO hand in physics simulation, reproducing human demonstrations while discovering the underlying contact forces that generate the observed object motion. DexCanvas is the first manipulation dataset to combine large-scale real demonstrations, systematic skill coverage based on established taxonomies, and physics-validated contact annotations. The dataset can facilitate research in robotic manipulation learning, contact-rich control, and skill transfer across different hand morphologies. Dexterous manipulation with high-DoF anthropomorphic hands is fundamental to robot learning: it enables the most general form of object interaction and is essential for robots to achieve human-level autonomy in unstructured environments (Yu & Wang, 2022; Ozawa & Tahara, 2017). The field has witnessed rapid advancement along two dimensions: diverse learning paradigms including reinforcement learning for contact-rich control (Chen et al., 2024; 2023) and diffusion-based methods for handling multimodal action distributions (Weng et al., 2024; Wu et al., 2024), alongside dramatic scale expansion from task-specific models to billion-parameter foundation models (Wen et al., 2025; Kim et al., 2024; Zitkovich et al., 2023). However, current flagship manipulation systems predominantly rely on parallel-jaw grippers, while generalizable control of anthropomorphic hands remains limited to simulation or narrow real-world scenarios.
This gap highlights an opportunity: to unlock the full potential of dexterous manipulation, we need large-scale datasets that capture diverse human manipulation strategies with physically accurate contact dynamics and force profiles, the crucial signals for learning robust dexterous control. Building such datasets requires careful consideration of data sources and collection methodologies. The choice between robot-generated and human-sourced data presents fundamental tradeoffs for learning manipulation.
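The real-to-sim step the pipeline describes, an RL policy actuating a MANO hand to reproduce a demonstration, hinges on a tracking reward. A hedged sketch of such a reward follows; the weights, error scales, and flat state vectors are illustrative, not DexCanvas's actual formulation:

```python
import math

def tracking_reward(sim_joints, demo_joints, sim_obj, demo_obj,
                    w_pose=0.7, w_obj=0.3):
    """Score one imitation step: exponentiated squared errors between the
    simulated hand/object state and the human demonstration, so reward is
    1.0 for a perfect match and decays smoothly with deviation."""
    pose_err = sum((a - b) ** 2 for a, b in zip(sim_joints, demo_joints))
    obj_err = sum((a - b) ** 2 for a, b in zip(sim_obj, demo_obj))
    return w_pose * math.exp(-5.0 * pose_err) + w_obj * math.exp(-20.0 * obj_err)

perfect = tracking_reward([0.1, 0.2], [0.1, 0.2], [0.0], [0.0])
off = tracking_reward([0.3, 0.2], [0.1, 0.2], [0.1], [0.0])
assert abs(perfect - 1.0) < 1e-9
assert off < perfect
```

Maximizing a reward of this shape in a physics simulator is what forces the policy to discover contact forces that actually produce the observed object motion, which is the physically consistent annotation the dataset advertises.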


DiffAero: A GPU-Accelerated Differentiable Simulation Framework for Efficient Quadrotor Policy Learning

Zhang, Xinhong, Wang, Runqing, Ren, Yunfan, Sun, Jian, Fang, Hao, Chen, Jie, Wang, Gang

arXiv.org Artificial Intelligence

Abstract-- This letter introduces DiffAero, a lightweight, GPU-accelerated, and fully differentiable simulation framework designed for efficient quadrotor control policy learning. DiffAero supports both environment-level and agent-level parallelism and integrates multiple dynamics models, customizable sensor stacks (IMU, depth camera, and LiDAR), and diverse flight tasks within a unified, GPU-native training interface. By fully parallelizing both physics and rendering on the GPU, DiffAero eliminates CPU-GPU data transfer bottlenecks and delivers orders-of-magnitude improvements in simulation throughput. In contrast to existing simulators, DiffAero not only provides high-performance simulation but also serves as a research platform for exploring differentiable and hybrid learning algorithms. Extensive benchmarks and real-world flight experiments demonstrate that DiffAero, combined with hybrid learning algorithms, can learn robust flight policies in hours on consumer-grade hardware. Quadrotors--and swarms thereof--are increasingly deployed in complex environments for aerial inspection, environmental monitoring, and high-speed racing, owing to their agile maneuverability and onboard sensing capabilities. End-to-end learning addresses these limitations by training neural flight policies that map raw sensor observations directly to control commands, thereby streamlining the autonomy stack and enabling tighter feedback loops [4].
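What "fully differentiable" buys in a framework like this is that task-loss gradients flow through the physics step itself, so policies can be trained with first-order updates instead of likelihood-ratio estimators. A 1-D sketch with hand-rolled forward-mode derivatives; the dynamics and policy are illustrative, not DiffAero's:

```python
def rollout_with_grad(k, x0=1.0, dt=0.1, steps=50):
    """Differentiable-simulation sketch: a 1-D 'vehicle' with dynamics
    x' = u and policy u = -k*x. We propagate dx/dk through every physics
    step, so the task loss has an analytic gradient in the policy
    parameter k."""
    x, dx_dk = x0, 0.0
    for _ in range(steps):
        u = -k * x
        du_dk = -x - k * dx_dk      # chain rule through the policy
        x = x + dt * u              # explicit Euler physics step
        dx_dk = dx_dk + dt * du_dk  # ...and through the dynamics
    loss = x * x                    # task loss: end up at the origin
    dloss_dk = 2.0 * x * dx_dk
    return loss, dloss_dk

# First-order policy learning driven by simulator gradients.
k = 0.0
for _ in range(100):
    loss, g = rollout_with_grad(k)
    k -= 0.5 * g
final_loss, _ = rollout_with_grad(k)
assert final_loss < 1e-3
```

Real differentiable simulators do the same bookkeeping with reverse-mode autodiff over thousands of parallel environments; the per-step chain rule is the only part this toy makes explicit.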